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Abstract 

We consider N Bernoulli random variables, which are independent conditional 
on a common random factor determining their probability distribution. We show 
that certain expected functionals of the proportion Ln of variables in a given state 
converge at rate 1/N as N ->• oo. Based on these results, we propose a multi-level 
simulation algorithm using a family of sequences with increasing length, to obtain 
estimators for these expected functionals with a mean-square error of e 2 and com- 
putational complexity of order e~ 2 , independent of N. In particular, this optimal 
complexity order also holds for the infinite-dimensional limit. Numerical examples 
are presented for tranche spreads of basket credit derivatives. 

Key words: Multilevel Monte Carlo simulation, large deviations principle, exchange- 
ability, basket credit derivatives 

1 Introduction 

This article is concerned with the behaviour of functionals of a large number of exchange- 
able random variables and the efficient numerical estimation of their expectation. The 
objective of this work is thus two-fold: to analyse the order of convergence in 1/N of ex- 
pected functionals for N -> oo, and to derive estimators for these expectations for which 
the computational complexity is asymptotically independent of N. 

First, we analyse convergence in the case of general Lipschitz and smooth functions p of 
the average of N exchangeable Bernoulli random variables as N goes to infinity. We then 
specify p further to certain piecewise linear functions and show that the convergence order 
is the same as in the smooth case. These results are relevant, for instance, if one wants 
to approximate the result for large but finite N by its limit. A number of applications 
comes from the credit risk literature. In [11], Vasicek derives an expression for the 
limiting distribution of portfolio losses in a Normal factor model, where default of a firm 
is indicated by its value process being below a default barrier at maturity of the debt. In 
the large portfolio limit, the randomness comes solely from a common market factor, while 
a law of large numbers holds for idiosyncratic components conditionally on this factor. 
Bush et al. , in [3J , extend this to a dynamic set-up where it is seen that the density of the 



1 



limit empirical measure of firm values satisfies a stochastic partial differential equation 
(SPDE) and can be used to approximate tranche spreads of basket credit derivatives; [2j 
gives an extension to jump diffusion models while [7J include extensions to heterogeneity 
and self-exciting defaults rendering the resulting equations non-linear. Further studies 
focus particularly on the tail of the limiting loss distribution, see [5], [9] and the references 
therein. 

A driving practical motivation for investigating the limiting behaviour is that the 
original sequence of random variables is costly to simulate, because of the large number 
N of underlying processes, often required over large time horizons. Moreover, often many 
Monte Carlo samples are necessary for sufficiently accurate estimation of, for instance, 
expected tranche losses of credit basket. This paper takes a different tack and develops 
a simulation method where the computational complexity is asymptotically independent 
of N. A small tweak of the algorithm can also be used to approximate the limit obtained 
for N -> oo. 

More concretely, it turns out that an interpretation of the multi-level Monte Carlo 
approach (see [8]) in the present context allows us to construct estimators based on 
sequences with increasing lengths and a number of samples which decreases faster than 
the length increases, such that the overall computational complexity is essentially no 
larger than for fixed small N. 

A conceptually similar though distantly related approach is used in pQ, where the 
multilevel idea is applied to a sequence of martingales to estimate a dual upper bound 
for the value of an early exercise option. In that setting they are able to show, as we do 
here, that the achievable complexity is not substantially larger than that of a non-nested 
simulation. The general problem of estimating conditional expectations through nested 
multilevel simulation is addressed in [5]. There, further extrapolation is used to reduce 
the bias of estimators, while here we will propose an improved estimator which reduces 
the variance of higher level estimators. 

This article is organised as follows. In Section [21 we introduce the setting and outline 
the main convergence results, explaining how they can be used to construct efficient 
estimators. The first key result on the convergence order of expected functionals is proved 
in Section [3l with numerical illustrations from an example of a basket credit derivative 
presented in Section |H In Section we introduce in detail two multilevel simulation 
methods and derive bounds on their computational complexity to achieve a prescribed 
accuracy. Finally, in Section [6] we present numerical results illustrating the efficiency 
gains achieved through multilevel simulation in this context and Section [7] concludes. 

2 Set-up and main results 

In this article, we are concerned with the behaviour of "loss" variables describing the 
fraction of N random variables in a certain state, and expected functionals of this loss 
variable, as N goes to infinity. The application we have in mind, and for which we will 
present numerical illustrations, is that of a basket of defaultable firms, and then the loss 
is the fraction of firms which default over a certain period. 

More precisely, on a probability space P), consider a sequence of Bernoulli 

random variables Y{, i e N, and a random variable L taking values in [0,1]. If required 
we write f2 = Oy x Q.l where canonically we could take Oy = {0,1}^ and $7^ = [0,1]. 
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The probability measure P is constructed as follows. The random variable L is generated 
according to its marginal law P^ and then, conditional on Tl, the cr-algebra generated 
by L, the Y{ are independent random variables with law given by 

P[Y { = l\T L ] = L. (2.1) 

Thus for each n e N 

F(Y 1 = y 1 ,...,Y n = y n ,LeB)= f Z a »(l - l) n ~ s ^ L {L e dl), V ffi e{0,l},5c[0,l] 

where s n = YJH=xVi- We will often write P|^ = P(. | J~l) for the conditional law of the Yi 
given Tl and Ei£ for the associated conditional expectation. In the setting of defaultable 
firms, Yi = 1 iff the i-th firm defaults, and L is a global factor modelling the common 
tendency of firms to default. We define the loss variable to be the proportion of Bernoulli 
variables in state 1 

1 N 

L N = ^T,Yi. (2.2) 
We consider a Lipschitz function p and random variables P and P/v defined as 

P e p(L), (2.3) 
P N = p(L N ). (2.4) 



In particular, we will study p of the form 
p(l) = [l-K 1 ] + -[l-K 2 ] + ^< 



l<Kt, 

l-K x K x <l<K 2 , (2.5) 
K 2 -K x 1>K 2 , 



where [x] + = max(x,0) denotes the positive part and < K\ < K 2 < 1 are constants. 
In credit derivative pricing, the particular shape of the function p in (|2.5|) measures the 
losses in a certain tranche with attachment point K\ and detachment point K 2 , and 
its expectation is the building block for formulae for CDO tranche spreads. A typical 
CDO pool consists of N - 125 firms, while typical loan or mortgage books can have 
substantially more obligors, and it is therefore practically relevant to understand the 
behaviour of expected functionals for large iV and to devise computationally efficient 
estimators. 

By a conditional version of the strong law of large numbers and the continuity of p 

L N -> L foriV^oo, P| L -a.s., (2.6) 
P N -> P foriV^oo, P [L -a.s. (2.7) 

This convergence will also hold in L 2 (Qy ( see Lemma l3.ip . 

We study here the convergence rate of P^-P and will prove the following two results. 
The first statement for Lipschitz and smooth functions p is a relatively straightforward 
consequence of (|2.ip and the easily computable L 2 convergence rate of Ljy. The second 
result shows that for a specific p which is only piecewise smooth we can still obtain the 
same convergence order as in the smooth case and with explicitly computable bounds. 
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Theorem 2.1. Let P and P/y be defined by (|2.3p and (|2.4p . respectively, and assume 
that p is Lipschitz with constant c p . We have that 



c 2 

Far^-P] < (2.9) 
i/j moreover, p is differentiate and the derivative has Lipschitz constant C p , then 

\®[Pn-P]\ < (2-10) 

Theorem 2.2. For p defined in \2. 5]) . 2/ i/ie cumulative density function (CDF) Fl of L 
is Lipschitz at any K\ > and K2 < 1 with Lipschitz constant cl, i.e., 

\F L {K j )-F L {l)\<,c L \K i -l\ (2.11) 

/or j = 1,2 and a/I / e [0, 1], then 

ic L y/7r 



\E[P N -P]\< 



N 



Note that if L has a density function which is bounded, then the CDF is certainly 
Lipschitz. The fact that we only need the Lipschitz property at K\ and ~K% will be useful 
for the applications considered later. 

Taking the two Theorems together, order 1 for the convergence of expectations also 
follows for piecewise smooth p which are Lipschitz overall, provided Fl is Lipschitz. 

These Theorems show that expected functionals for large or infinite iV can be succes- 
sively approximated by those with smaller N. Combining this with a control variate idea 
leads to multilevel simulation with a substantial variance reduction for large N. Specifi- 
cally, the above results imply that for Lipschitz p we have |E[P/v - Pmjv]| ^ ci/vN and 
Var[P]y - Pmn] ^ C2/N for any positive integer M with some constants c\ and ci- We 
can consider a sequence Ni = M l , I e N, with corresponding ifi' = Ljq l and P^ 1 ' = P/v r 
Translating the central idea in [8] to this setting, we estimate in the decomposition 

1 

E[P W ] = E[P (0) ] + Y, E[P W - P^ 1 )] (2.12) 

k=l 

every summand separately with independent samples of different sizes. These can be 
chosen to obtain an optimal allocation of computational cost for a given overall mean- 
square error (MSE). The general construction in [8] immediately gives the following result. 

Proposition 2.1 (cf. [8], Theorem 3.1). Let P, as above. If there exist independent 
estimators Z[ based on n\ Monte Carlo samples, and positive constants a,(3,ci,C2,cs such 
that a > 5 and 



i) E[P (0 -P] <ciM 



-Oil 



11) E[Zj] = 



E[P(°)], 1 = 

E[p(0-p0-i)], />0 
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in) Y[Z t ] <c 2 nJ 1 M^ 1 

iv) C[ < c^riiNi, where C\ is the computational complexity of Z\ 

then there exists a positive constant C4 such that for any e<e~ l there are values K and 
ni for which the multilevel estimator 

Gk = Y,Zi, (2.13) 
1=0 

has a mean-square-error with bound 

MSE = E [(Gk - E[P]) 2 ] < e 2 

with a computational complexity C with bound 

c 4 e~ 2 , (3 > 1, 

('< c 4 £- 2 (log£) 2 , /3 = 1, 

C4£ -2-(1^8)/a ) 0</?<L 

Corollary 2.1. Let Pjy and P be as in \2. 3\) and \2.J$ , and assume p is Lipschitz. 

1. There is a multilevel estimator for E[P] with MSE e 2 with computational complexity 
C <c 4 (loge)V 2 . 

2. For all N, there is a multilevel estimator for E[pv] with MSE e 2 with computational 
complexity C < C4(log e) 2 e~ 2 , where C4 is independent of N. 

Note that only order 1/2 is required for the convergence of expectations in Proposition 
12.11 and that the complexity is then dictated by /3, the case /3 = 1 implied by Theorem 
12.11 for all Lipschitz payoffs being a boundary case. 

We point out that for fixed N the standard (i.e., single level) Monte Carlo estimator 
has a complexity C < c e~ 2 , however, c increases linearly in N. The significance of the 
multilevel estimator is that the constant C4 is independent of N. 

For the specific p as in (|2.5p , we can exploit the piecewise linearity of p to construct 
multilevel estimators with even better complexity, by making the following observations: 
The summands in (|2.12|) are unchanged if we replace = p(L^ k ^) with any of 

p(Lm ^) for m = 1,. . . , M, where 

Lt 1 ^^- £ Yi+im-DN^- (2-14) 
Wfc-i i=i 

This is a direct consequence of the exchangeability. Now, 

1 M 

L W = _L y l^- 1 ) (2.15) 

and, if all L^ 1] lie in the same interval [0,K{], (Ki,K 2 ] or (K 2 ,l], also p( fc > = P {k ~ 1} , 
where 

?(fc_1) s TfZ P m _1) = TfZ (2-16) 
JW m=l JW m=l 
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since p is linear in these intervals. Because of E[p( fc 1 ^] = E[P^ we can now write 
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E[P il) ] = E[P (0) ] + £ E[P (fc) - P ( '], (2.17) 

fc=l 

and if we estimate the individual terms in the sum independently in the multilevel spirit, 
there is only a variance contribution from a specific sample of the k-ih term if at least 
two Pm lie in different intervals. For large k, the probability of this is small, and we 
will be able to show the following result. 

Theorem 2.3. For p as in \2. 5\) . let P^ as in Proposition \2. 1\ and P^ 1 as in 112. 16\) . 

If the CDF Fl of L is Lipschitz with Lipschitz constant cl, then 

Var[pU -P 0_1) ] < (2.18) 



where c 2 = c L 4a/M^(V2 + v^^f (M 2 + 6M + 1). 



Here and throughout the paper we give explicit expressions for the constants. These 
should not be regarded as optimal in any sense. 



Corollary 2.2. For Lipschitz Fl and p as in \2. 5)) . there is a constant C5 and multilevel 
estimators for E[P] and E[P/v] with MSE e 2 with computational complexity C < C5 e~ 2 . 

The point here is that we managed to remove the logarithmic factor present in Corol- 
lary 12.11 and that C5 does not depend on N. 

3 Proof of convergence rates 

We first prove Theorem 12.11 which contains statements in the general and smooth case. 
The rest of this section is devoted to the proof of Theorem 12.21 dealing with a specific 
non-smooth payoff relevant to our application. 



Lemma 3.1. Let P/v and P be as in \2. 3)) and ( f£.^| ), and assume p is Lipschitz with 
constant c p . Then 

%[(^W) 2 ]<^- 
Proof. Since the function p in (j2.3j) is assumed Lipschitz and E^Ltv] = L, we have 

2 2 

E ]l [(Pn - P) 2 ] < c 2 p E {L [(L N - L) 2 ] = c\ Var[L N \F L ] = ^ Var[Y^£\ = ^L(1-L). 

For L e [0, 1], L(l - L) < ±, which gives the result. □ 

Proof of Theorem \2.1\ Equation (j2.9[) follows directly from Lemma I3.1| and then, by 
Holder's inequality, 



|E[P ~P N ]\< yJm\d(PN - P) 2 ]] < 



2VN 
6 



For differentiable p, we can write 

E[p(L) -p(L N )] = E[p'(L)(L - L N )] + E[r(L,L N )], 
with some remainder r, where the first term on the left-hand side is 

E[E lL [p'(L)(L-L N )]] = 0. 

For a Lipschitz derivative of p, 



1 



\2 



\p(x) -p(y) -p'(x)(x -y)\ < -C p (x-y) 
for all < x,y < 1 and the remainder term satisfies 

\E[r(L,L N )]\<^, 

from which (|2,10p follows. □ 

Now, we turn to the proof of Theorem 1 2 . 2 1 and show a few Lemmas first. We divide the 
ranges of L and Ljv into the three intervals I\ = [0, ifi], I2 = (K2,l] and I3 = (K\,K<i\, 
in each of which the function p from (|2.3|) is linear; the point being that the probability 
of L and Ljv lying in different intervals is small for large N, and the expected difference 
of P - Pjy is small if they are in the same interval. The following Lemmas quantify this. 

Lemma 3.2. For j = 1,2, we have 

P| L (Le/„L;ve/, c ) < e^"^ 2 , (3.1) 

P^Lei-Ljve/,) < l i£f; e^» 2 . (3.2) 

Proof. This is a standard large deviations result. By Theorem 2.2.3 in [B], p. 27, and 
Remark (c) thereafter, for ^-independent and identically distributed random variables 
{Yi)\<i<N with E[Yi I J~l] = L, we obtain that if < L < Kj, 



L (L N >K j )<e~ N ^\ 



and if Kj < L < I, 



F lL (L N <Kj)<e- N ^ L > K i\ 
where the rate function g(L,Kj) is given on p. 35 in [BJ as 

g{L,K j ) = K j log^ + il-Kj) log(i^), 

since Yi are Bernoulli distributed random variables with P(Yi = 1 | Tl) - L. It is 
straightforward to check that for all L e (0, 1) 

d(L, Kj) > (Kj - L) 2 . (3.3) 

Hence, by for < L < Kj 

F ]L (L N >K 3 )< e ~ N9 ^ K ^ < e ~ N ^ K ^ 2 , (3.4) 

and similarly for Kj < L < 1. These estimates are clearly true for the degenerate cases 
L = and L = 1. From this the result follows. 

□ 
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Lemma 3.3. Let p be as in \2. 5)) . If A^ is the event that Ln and L are in the same 
interval and A C N its complement, then 

E[(P N - P)1 An ] = -E[(L N - L)l A c N l {Leh} ]. (3.5) 

Proof. By splitting the range of L into the different intervals, 

E[(P N -P)U N ] = ZE[(P n -P)1 An 1 {L£I . } ] 

= E[(L N - L)l AN l {Leh} ] 
= -E[(L N -L)l A o N l {L , Is} ], 

where we have used in the second line that P/v = -P if both Ln and L lie in either I\ or I2 
and that Pjy—P = L^-L in ^3; in the last line that E^[L^-L] = and 1 An +1a c = 1- D 

Lemma 3.4. Let A C N be as in Lemma \3.3l If the CDF Fl of L is Lipschitz at Kj, j = 1,2, 
urai/i constant cl, then 



E 



(%[^v]) 



(3.6) 



Proof. Let = (i^i, 1] and I 2 = [0, -K2] be the complements in [0, 1] of I\ and I2, then 

c {L e /! , Lat e A c } u {L e J 2 , € / 2 C } u {L e ij, Ljv e /1} u {L e / 2 C , L w e If} 
and therefore 

%[^v] < L2v6if]+P|i[L6/ 2> L N €ll]+F ]L [LeI1, L N e h] 

+ P, L [Le/ 2 c , Ljve/ 2 ]. 



By (J37L]), ([321) w e have 



and we obtain 



E 



< 2a I E 



e- N 2 



+ E 



(L-K 2 y 



e~ N a 



If we extended i*L by and 1 from [0, 1] to M then, for j = 1, 2, we have 



E 



e 2 



J -c 



e~ N - 2 



dF L {l) 



(3.7) 



(3.8) 



= N 



00 



-iV 



(1-KjY 



-F L (l) dl 



(3.9) 



< NFi{Kj) / (l-Kj)e- N ^^- dl + c L N j (I - Kj) e 2 dl 

J OO J 00 

clVZtt 



N 

where we used the Lipschitz property of the CDF after (|3.9p and then integrated exactly. 
The result follows directly by insertion in (|3.7p , □ 
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Proof of Theorem \2.2l By the tower property of conditional expectations and Jensen's 
inequality, we have 



\K[(Pn - P) 1a°J < E[|E| L [(Pjv - P) U^]|]- (3.10) 
Then Cauchy-Schwarz gives 

\E[(P N - P) l A oJ < E, L \{E[(P N - P) 2 ]Y (F {L [A c N ]y 



(3.11) 



By Lemmas 13.11 and 13.41 we obtain 



|E[(P^-P)1^]|<^|^. (3.12) 
Similarly, using Lemma 13.31 and the same argument as above, 

\E[(Pn-P) U n ]\< CJ ^, 
from which the statement follows. 

4 An application and numerical results 

To illustrate the theoretical rate of convergence, we study numerical results for expected 
tranche losses of a synthetic CDO for an increasing size N of the underlying CDS pool. 

We consider a structural factor model (see, e.g., [IQIG]), where the distance-to- default 
of the i-th firm, i = 1, . . . , N, evolves according to 



□ 



X\ = Xi + pt + ^/T^Wi + y/pBt + Jt, t>0, (4.1) 
where p e [0,1), f3 given. Here, B is assumed to be a standard Brownian motion and 

(~i~p ____ ___ 

Jt - E^ =1 *nfc, where CPt is a compound Poisson process with intensity A and 11^ are 
independent Normals with mean /in and variance cr^, while all W' b are independent 
standard Brownian motions and independent of B and J. Thus B and J model factors 
affecting the whole market, whereas W l are idiosyncratic effects. 

The i-th firm is considered to be in default if its distance-to-default is below at any 
one of the observation times Tj = jq, q = 0.25 (quarterly), up to T20 = T = 5, the assumed 
maturity of the debt here. We introduce the default time Tj and Bernoulli random variable 
Yi indicating default of the i-th firm before T, by 

= inf({t e {T 1 ,...,T M }:^<0}u{oo}), 

Yi = l{ r i< T j.. 

For the numerical experiments, the initial values Xq are drawn independently from a 
Normal distribution, 

X l ~ N(p Xo ,a 2 Xo ), 

where the mean p,x = 4.6 and standard deviation o~x = 0.8 are obtained from a calibra- 
tion to iTraxx data as detailed in [2], as are p = 0.13, A = 0.04, /in = -0.5 and = 0.17. 
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Figure 1: Top row: Empirical CDF Fl for different values of /io = p-x (left) 
and different pa (right). The values of pa are arrived at by (|4.2p from (p, A) e 
{(0.03,0.001), (0.1,0.002), (0.35,0.0035), (0.35,0.0351), (0.8,0.1)}. All other parameters 
are fixed as given in the text. The plots in the second and third rows are zoomed into 
the ranges of L close to and 1, respectively. 
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We illustrate the CDF Fl of L for different parameters in Figure [TJ To this end, we 
generated samples of L by simulating the SPDE satisfied by the limit empirical measure 
of distances-to-default for N -> oo, see [2] for details. It appears that Fl is Lipschitz 
in (0,1) but that the derivative at and 1 can become very large in certain parameter 
ranges for po - ^x and overall instantaneous correlation 

PA = (p + C)/(l + C), C = A(/4 + 4), (4.2) 

between X\ and Xf (see [2]). 

For large values of po, the probability of defaults becomes very small and the density 
of L is concentrated around 0. For pa approaching 1, all Yi become identical and therefore 
either all or none of the firms default, such that here the density of L is concentrated at 
and 1. In the degenerate case pa - (i.e., p = A = 0), L is deterministic, the measure is 
atomic and Fl a step function. 

The empirical evidence thus suggests that Fl is Lipschitz in the range (0, 1). Given 
that Theorem 12 . 2 1 only requires the Lipschitz property at interior values Kj, the conditions 
appear to be satisfied and the Theorem to apply in this setting. Even in situations where 
Fl has a bounded derivative at and 1, the fact that only the Lipschitz constants from 
K\ and K2 enter into the estimates gives us substantially smaller bounds. 

We now move on to present numerical results for the payoff function p from f)2.5[) 
illustrating the convergence as the number of firms N goes to infinity. We consider 
portfolios consisting of Nk = M k = 5 fc companies for k = 1, . . . , 7. 

To include a recovery value of defaulted firms in the model, we rescale by (1 -R), 
where R = 0.4 is the recovery rate. Equivalently, we pick (K\,K2) = (1 - R)^ 1 (a,d) in 
(USD and (a,d) e {(0, 0.03), (0.03, 0.06), (0.06, 0.09), (0.09, 0.12), (0.12, 0.22), (0.22, 1)} as 
the attachment and detachment points for iTraxx tranches, and then study {l-R)p{L^). 

A straightforward Monte Carlo estimator for expected tranche losses E[p( fc '] is then 
given by 

1 n 

G k = -Y,(1-R)P(L^), (4.3) 
L (W) = ±XY<*\ (4.4) 

N k i= l 

where (Y^ ) are independent samples of Yi, i.e., corresponding to independent paths 
for B, W and J. There is no time discretisation error as (|4.1|) can be sampled directly. 
However, it turns out to be computationally prohibitively expensive to choose n, the 
number of samples, large enough to produce estimators with sufficiently small RMSE to 
allow us to distinguish between Gk and Gk+i for large k. 

We therefore use the multilevel simulation approach outlined in Section [2] and detailed 
further in Section [5) The point is that the differences Gk+i - Gk are simulated directly in 
the multilevel approach. Therefore, we approximate \G - Gk\, where G = lim^oo Gk, by 



Sk = \Gk - Gk\ 



K 
l=k+l 



(4.5) 



for k < K, where Z\ is an estimator for E[P^ - P^ 1 ^] as used in the construction of 
Gk in (|2.13p (precisely, we used the estimator Z\ defined later in (|5.ip ). The difference 
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between Sk and \G-G}.\ for k = K - 1 is given by Gk-i - Gk ~ {Gr-i - G)(l - 1/M) and 
for k = K - 2 by G K -2 - G K ~ (Ga'-2 -G){1- 1/M 2 ). Given M = 5 in our examples, the 
error due to this approximation will be seen to be smaller than the estimation error. 

The results are shown in Figure We plot the logarithm of to base M, together 
with the sample standard deviation of the the multilevel estimators Gk (see (|2.13p ) and 

yk = -k + y , (4.6) 

where yo is a suitably chosen constant, to verify the predicted convergence order empiri- 
cally. The data points appear to be in good agreement with first order convergence. 

5 A multilevel method and its numerical analysis 

In this section, we describe and analyse a multilevel simulation approach for the esti- 
mation of expected functionals of the form (|2.3p and (|2.4[) . the latter with a particular 
emphasis on the case of large N. 

The multilevel Monte Carlo method proposed by Giles in [8] estimates the expected 
value of a functional of the solution to a stochastic differential equation obtained by 
a timestepping scheme. It performs computations on different refinement levels I with 
time steps hi = h$M~ l for M > 1, such as to minimise the overall computational time 
of the Monte Carlo estimator for prescribed mean square error (MSE). Since the MSE 
consists of a Monte Carlo error (variance) and a discretisation error (bias), the method 
controls both the number of samples ni on level I, to bound the Monte Carlo variance of 
order O^J 1 ), and the finest L with time step h~ L on which to approximate the SDE, in 
order to reduce the bias. The multilevel method is based on two premises: Monte Carlo 
estimators for an increasing number of time steps converge at a certain order in hi, and 
the computational cost needed to calculate an estimator increases with nihj 1 . In this 
approach, estimators obtained with a smaller number of time steps are used as control 
variates for estimators with a larger number of time steps, which significantly decreases 
the computation time. 

To obtain a complexity result for an estimator of E[P] with P from (|2.3p . we substitute 
hi by iV^ 1 in Theorem 3.1 of [5] and immediately obtain Proposition 12.11 from Section [21 
We hence define estimators Zi by 

Zpn^fpW'-pW)), (5.1) 

i=i 

where 'c' denotes a 'coarse' estimator on level I, i.e., using only iVj_i instead of Ni Bernoulli 
random variables, precisely, 

Ni 

p(W = p (L^ j) ), where = Nf 1 Y l Yy ,3 \ (5.2) 

i=l 

= p(L^), where L^> = Nf_\ £ Y™, (5.3) 

i=l 

where K , j = 1,... ,ni, are independent samples of Y{ for fixed level I and independent 
across levels. They are constructed from a loss factor L™' (with the same distribution 
as L, independent across I and j) in the same way that Yi is constructed from L. 
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Figure 2: Shown here is log^ S^, where Sk given by (|4.5p is an estimator for \E[P^-P]\. 
The various plots are for tranches ranging from [0%-3%] to [22%-100%], of a CDO basket 
consisting of = M k = 5 k companies, where k = 1, ...,6. The comparison with the 
predicted trend yk from (|4,6p confirms the first order convergence. Included is also the 
standard deviation of the estimated tranche loss Gk- 



13 



By direct inspection, for this construction of Z\, Assumption ii) holds in Proposition 
12.11 From Theorem 12, 1[ we know that [I)] holds with a = 1/2 for general Lipschitz p. 
Clearly, the computational effort to compute Z\ is proportional to n/iVj as required in 



iv 



Finally, iii) holds by the following simple application of Lemma 13. 11 

Proposition 5.1. Let P"' = P^ l as per /jj2.4\ ), where p is Lipschitz with constant c p , then 

Var[pM - P«-»] < cl^±. (5.4) 

Proof. This follows directly from 

%[CP (0 -P (w) ) 2 ] = E, L [((p( / )-P)-(P^ 1 )-P)) 2 ] 

< 2(E| L [(P«-P) 2 ]+E| L [(P('- 1 )-P) 2 ]) (5.5) 

by Lemma 13. II and taking expectations over L. □ 
We have therefore proven the first statement of Corollary 12.11 



In practice, it is also relevant to be able to compute E[P/v] efficiently for finite N. 
It is clear that for fixed N, the complexity is bounded by c e~ 2 for some c > 0, but for 
a naive (single-level) estimator the constant c will increase with N. From the proof of 
Theorem 3.1 in [8] it is clear, however, that there is a multilevel estimator with a priori 
bounded upper level K which satisfies the second statement in Corollary 12.11 

We now propose a multilevel estimator with even lower variance, based on the faster 
decay rate 3/2 in Theorem 12.31 which we prove subsequently. Specifically, we define 
estimators Z\ by 

Z^n^t(P (l ' j) -P {l ' j) ), M 

3=1 V ' 

where P^' J ) is defined as in ()5.2|) . but instead of Pc we use 

_, M Ni-i 

P ( J) - M- 1 Y,V{L^), where L™ = Nf_\ ^ Y$l_ m ^, (5.7) 

and where the rest of the set-up is as earlier. 

It is clear that Z\ satisfies ii) in Proposition 12 . 1 1 and that the computational complexity 



is still bounded as required per iv) In fact, as the main computational cost is typically 
in sampling Yi, the computational complexity is virtually identical to that of Z\. In 
particular, if we evaluate (|5.2p by using (|2.15p and the already computed (|5.7p . the 
difference in evaluating Z\ and Z\ is an 0{M) cost, i.e., independent of N[. Now, given 
Theorem 12.31 we have that 

Var[Zi] < cn^M'^ 21 , (5.8) 

for some c, such that we are in the first regime in the complexity result of Proposition 
12.11 i.e., we have optimal complexity order. 

The remainder of this section is devoted to the proof of Theorem 
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Lemma 5.1. Assume the CDF Fl of L is Lipschitz with constant cl- Let B^ 1 ' be the 
event that ift' lies in the same interval as B^ ,c its complement, then 



E 



(%[£ (0 ' c ])' 



C 



where C = c L 4 v / 7r(\/2 + \/M). 

Proof. Let A® again be the event that and L are in the same interval, A^' c its 
complement. Then from 

B (D,c c (^)nA( / - 1 )' c )u(^ (0 ' c n^- 1 ))u(^ (0,c n^- 1) ' c ^ 



follows 



) L [B il) ' c ] < P| L [A (/) n A {l ~ x) ' c ] + P [L [A (/) ' C n A^ 1 *)] + ¥ ]L [A^' C n A (,_1) ' c ] 
< 2P [L [A (i) ' c ]+Pi L LA ( ^ 1) ' c ], 



which leads to 



E 



(f il [b^ c ]) ] 



< V2E 



(F ]L [AW' C ]\ 2 +E (f ]l [A^' c ]) 



By Lemma 13,41 we obtain the result. 
Lemma 5.2. For P, P^ and P (W) 



as above, p Lipschitz with constant 1 
3 



|LLV ; J 16iV 2 \ 3NJ 16Nf 



16Nf 

C|i[(P (z) -P (z_1) ) 4 ] < ^, 



%[(PW-P 0_1) ) 4 ] < 



□ 

(5.9) 
(5.10) 
(5.11) 



where C = |(AP + 6M + 1). 

Proof. See Appendix [XI □ 

Proo/ o/ TheoremUM Let P (z) be the event that all lie in the same interval, 1 < 

m < M, and p(')' c its complement, then 

E[(P« -P (W) ) 2 ] = E[(P« -P (W) ) 2 1^(0] +E[(P (0 -P (W) ) 2 W]. 
By ()2.15p and linearity of p in each interval, we have 

E[(P^ -P^fl^] = 0. 
By Cauchy- Schwartz, we have 



Ei 



1(P {1) -P (l - 1} ) 2 l E ou] < - P (W) ) 4 ])" (%[^ (0 ' c ])' , 
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hence, 

E[(P^)-P (W) ) 2 1 E(0 , C ] <E 

By Lemma 15.21 we have that 

(%[(P«-P (W) )*])^< 

where ci = |(M 2 + 6M + 1). 

If we denote by B$ the event that I„ ^ and lie in the same interval, then 



N, 



(5.12) 



At 



(l),c 
m 



m=l 



and therefore 



. |£ (^W.c) < £ P |L (PW' C ) = MP| L (P^' C ). 

m=l 



By Lemma 15. 11 this gives 



E 



(P|l[£ (0 ' c ]) 



C2_ 



where C2 = C£ 4^/M7^(^/2 + vM). Together with (|5.12p . we obtain the result. 



□ 



6 Multilevel tests 

In this section, we present multilevel simulation results based on the estimators from the 
previous section and illustrating the theoretical findings from there. We return to the 
example from Section 0] and estimate expected tranche losses for credit baskets with an 
increasing number of firms iVj = M l . 

For the estimator Z\ from (|5.ip . an upper bound for the variance - although not 
a sharp one - is analytically known from (|5.4|) and we could use that to determine the 
number ni of samples on level I which is required to bring the variance contribution under 
a desired threshold. For the improved estimator Z\ from (|5.6p . however, the bound in 
(|5.8p contains the unknown Lipschitz constant of the CDF of Fl via Theorem 12.31 In 
order to determine the optimal allocation n\ , we use the following algorithm as per [8]. 
In contrast to there, the upper level K is fixed here which simplifies the stopping criterion 
somewhat. 

1. Start with k = 1. 

2. Estimate the variance 14 of a single sample using n& = 10 4 realisations. 

3. Calculate the optimal number of samples, nf, for 1 = 0,1,... ,k, using 




where 7 2 is a chosen upper bound of Var[Gx]- 
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4. Draw extra samples for each level according to n ; *. 

5. If k < K, set k - k + 1 and go to 2. 

6. If k = -fT, finish. 

Remark 6.1. As per J2J/ ; choosing n>* fry (|6.ip . guarantees that the variance Var[Gx] is 
bounded by j 2 , since 

Var[G K ] = EKr^^Ef^V^EV^) ^ = 7 2 - 

A side effect is that, for k < K , the variance is smaller than for k - K, since 

Var[G k ] = X(n l ) Vi < 7 K j= - 
i=i Li = iVVi^i 

Hence, if we compute estimators Gk for all k as a by-product of Gk , the variance is the 
smallest for G\ and then for Gk, k = 2, . . . , K , gradually reaches the upper bound 7 2 . This 
effect can be observed in Figure\^D. 

In Figure [3] we show results for the same parameter setting as in Section H] and only 
for the equity tranche. Results from other tests were very similar and did not show any 
noteworthy additional effects. In order to easily see the rate of convergence in OA., we 
plot the logarithm of V[ to base M, together with 

fk = -pk + f (6.2) 

for different values of /?. The estimated slope is j3 ~ 1 for the original estimator and 
/3 ~ 3/2 for the improved estimator, which agrees with the theoretical findings. The order 
of convergence of \E[P^ - is a ~ 1, which also agrees with the previous results. 

As can be observed in Figure [3lC, he number of samples ranges from 150 millions for k = 1 
to 34000 for k = 7. The improved estimator gives further reductions in computational 
time: the total number of samples ranges now from 35 millions for k = 1 to only 350 for 
k = 7. The standard deviation of G^ is an increasing function of k, and is less than or 
equal to the chosen upper bound 7 = 4 x 10~ 6 . 



7 Conclusions 

A main focus of this paper was the construction of an efficient simulation algorithm for 
functionals of a large number of exchangeable random variables. For a specific set-up, we 
were able to demonstrate optimal complexity order by theoretical analysis and numerical 
illustrations. 

The results from the previous section show that the computational savings can be 
significant in situations of practical relevance. As seen from Figure 0C, already for 
TV" = 125 (i.e., k = 3), the size of a CDO basket, the required number n.3 of samples on this 
level is reduced by about two orders of magnitude compared to the number of samples for 
k = 1, n\. It is roughly this number which would be required for a standard (i.e., single 
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A. Variance of a single Monte Carlo sample 



B. Estimated mean of higher level multilevel estimators 




C. Optimal number of samples x10~ 6 D. Standard deviation 




I 



Figure 3: Multilevel results for the expected loss in the equity tranche of a CDO basket 
consisting of Nj~ companies, Nk = M k - 5 k , k = 1, . . . , 7. Overlined quantities refer to the 
estimator Z\ from (|5.6p . all others to the standard estimator Z\ from (|5.ip . A. Variance of 
a single Monte Carlo sample, Vi and Vi, together with a predicted trend, fi, given by (j6.2|) . 
where f3 = 1 or f3 = 3/2. B. Mean at level I, Z\ and Zi, and a trend, yi, defined by (|4.6p . 
with slope -1. C. Optimal number of simulations in both cases, n ; * and nj , calculated 
according to (|6.ip for k = K = 7. D. Standard deviation of multilevel estimators 
defined in ()2,13p . and similar for Gk, with their chosen upper bound, 7. 
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level) estimator on level 3 for a comparable variance achieved by the multilevel estimator 
at substantially lower cost. 

We would expect there to be scope to apply the presented nested simulation approach 
to a wider range of settings beyond the particular application studied here. An interest- 
ing extension would be to the model from [7], where the analysis requires further tools 
accounting especially for the heterogeneity of the basket, resulting in non-exchangeability. 
While our motivation comes from credit baskets and some of the later results are specific 
to piecewise linear functionals encountered in the valuation of basket credit derivatives, 
there appears to be a wider relevance of the main approach to the simulation of certain 
functionals arising in large interacting particle systems and elsewhere. 

Acknowledgement: We thank Mike Giles for suggesting the improved estimator. 
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A Moment computations 



Proof, [of Lemma I5.2j We begin by showing (|5.9p and then deduce (|5.10p and (|5.1ip . We 

have 

| pW_p| < I L®-L\, 



where 



Hence, we get 



1 N t 
N l i=l 



%[(P«-P) 4 ]<%[(L«_ L )4 ]=E 



N, 



A 7 ^ i=l 



In 



As L is ^i-measurable and the Yi are independent and identically distributed given Tl 
with E|i[li - L] = 0, we have 



1 



E, L [(L^-L) 4 ] = -jE, £ 

JV « L 2=1 



E(^-^) 4 + 3^(y J -L) 2 (y J -L) 2 

^ ((1 - L) 4 L + L 4 (l - L)) + ((1 - L) 2 L + L\l - L)Y 

3L 2 (1-L) 2 L(1-L)(1-6L(1-L)) 



Nf 



Using the fact that < L(l - L) < 1/4 we have the required bound in (j5.9|) . 

For (|5.10p . observe that there are many ways of estimating this fourth moment; we 
choose the following 

(p(0_p0-l))4 = h p (i) _(p(i-i) -p)) 4 

< 2 ((pW - P) 4 + (P^- 1 ) - P) 4 ) + 12(PW - P) 2 (p( / - 1 ) - P) 2 . 

Therefore, using Cauchy-Schwarz on the last term and applying (|5.9p we have 

E| £ [(PW-P< ,_1 )) 4 ] < 2(E [L [(PW-P) 4 ]+E, L [(P( / - 1 )-P) 4 ]) 

+ 12 %[(P« - P) 4 ] 1 / 2 E| L [(P^ 1 ) - P) 4 ] 1 / 2 
7 7 42 

< + 7T- + 



8iV 2 8N^ 8NiN[_i 



as required to obtain ()5.10p . 
Finally, (|5,lip follows from 



%[ (p(0-p^) 4 ] = ^[(il^pw-pa-i))^ 



< E 



\L 



J_ y (p(0 _p0-i))4 
M m=1 



= E |L [(P«-P( / - 1 )) 4 ], 



and an application of (|5.10p . 



□ 
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